14 The Nature of Living Things
Differences Between Prokaryotes and Eukaryotes (2)
Bacterial genomes consist of blocks of genes preceded by regulatory (promoter)
sequences. Eukaryotic DNA resembles a mosaic of the following: genes (segments
whose sequence codes for amino acids, also called exons, from “expressed”, or “coding
DNA”);²⁰ segments (called introns) that are transcribed into RNA, but then excised
to leave the final mRNA used as the template for producing the protein (many genes
are split into a dozen or more segments, which can be spliced in different ways to
generate variant proteins after translation); promoters (short regions of DNA to which
RNA, proteins, or small molecules may bind, modulating the attachment of RNA
polymerase to the start of a gene); and intergenomic sequences (the rest, sometimes
called “junk” DNA in the same sense in which untranslated cuneiform tablets may
be called junk—we do not know what they mean). This is schematically illustrated
in Fig. 14.3.
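The excision of introns and the generation of variants by alternative splicing can be illustrated with a toy example. The following is only a minimal sketch under simplifying assumptions: the sequences are invented, the function names `splice` and `splice_variants` are hypothetical, and every in-order subset of exons is treated as a possible variant, whereas in a real cell splicing is regulated and only some combinations occur.

```python
from itertools import combinations

def splice(pre_mrna, exon_spans):
    """Join the exon segments of a transcript, discarding the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in exon_spans)

def splice_variants(pre_mrna, exon_spans, min_exons=1):
    """Enumerate transcripts made by keeping any in-order subset of the exons.

    Assumption: only exon skipping is modelled; exon order is never changed.
    """
    variants = set()
    for k in range(min_exons, len(exon_spans) + 1):
        for subset in combinations(exon_spans, k):
            variants.add(splice(pre_mrna, subset))
    return variants

# Invented toy transcript: three exons separated by two introns.
exons = ["AUGGCU", "GAACCU", "UGGUAA"]
introns = ["GUAAGUCAG", "GUACGUUAG"]

# Assemble the primary transcript and record where each exon lies.
pre_mrna = ""
exon_spans = []
for i, exon in enumerate(exons):
    exon_spans.append((len(pre_mrna), len(pre_mrna) + len(exon)))
    pre_mrna += exon
    if i < len(introns):
        pre_mrna += introns[i]

mature = splice(pre_mrna, exon_spans)            # all three exons retained
variants = splice_variants(pre_mrna, exon_spans)  # all exon-skipping variants
```

With three exons this simple model yields seven distinct transcripts (three single exons, three pairs, and the full-length form), which gives a sense of how quickly the number of possible variants grows when a gene is split into a dozen or more segments.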
Although the DNA-to-protein processing apparatus involves much complicated
molecular machinery, some RNA sequences can splice themselves. This autosplicing
capability enables exon shuffling to take place, suggesting the combinatorial
assembly of exons qua irreducible codewords as the basis of primitive, evolving life.
Organisms other than prokaryotes vary enormously in the proportion of their
genome that is not genes. The intergenomic material may exceed by more than an
order of magnitude the quantity of coding DNA. Some of the intergenomic material
is specially named, notably repetitive DNA. The main classes are the short (a
few hundred nucleotides) interspersed elements (SINEs), the long (a few thousand
nucleotides) interspersed elements (LINEs), and the tandem (i.e., contiguous) repeats
(minisatellites and microsatellites,²¹ variable-number tandem repeats (VNTR), etc.).²²
These features can be highly specific for individual organisms. Several diseases are
associated with abnormalities in the pattern of repeats; for example, patients suffering
from fragile X syndrome have hundreds or thousands of repeated CGG triplets at a
locus (i.e., place on the genome) where healthy individuals have about 30. The rôle
of repetition in DNA is still rather mysterious. One can amuse oneself by creating
sentences such as “can a perch perch?” or “will the wind wind round the tower?”
or “this exon’s exon was mistranslated”²³ to show that repetition is not necessarily
nonsense. The genome of the fruit fly Drosophila virilis has millions of repeats of
three satellites, ACAAACT, ATAAACT, and ACAAATT (reading from the 5′ to the
3′ end), amounting to about $10^8$ base pairs (i.e., comparable in length to the entire
20 The exome is the complete set of exons of an organism’s genome.
21 So called because their abnormal base composition, usually greatly enriched in C–G pairs (CpG),
results in satellite bands appearing near the main DNA bands when DNA is separated on a CsCl
density gradient.
22 Archaeal and bacterial genomes contain clustered regularly interspaced short palindromic repeats
(CRISPR; see, e.g., Sander and Joung (2014)). These have found technological application as a tool
for genome editing.
23 Most English dictionaries give only one meaning for exon, namely one of four officers acting as
commanders of the Yeomen of the Guard of the Tower of London.